{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "%matplotlib inline"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Lift Score"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Scoring function to compute the LIFT metric, the ratio of correctly predicted positive examples and the actual positive examples in the test dataset."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> `from mlxtend.evaluate import lift_score`    "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Overview"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the context of classification, *lift* [1] compares model predictions to randomly generated predictions. Lift is often used in marketing research combined with *gain and lift* charts as a visual aid [2]. For example, assuming a 10% customer response as a baseline, a lift value of 3 would correspond to a 30% customer response when using the predictive model. Note that *lift* has the range $\\lbrack 0, \\infty \\rbrack$.\n",
    "\n",
    "There are many strategies to compute *lift*, and below, we will illustrate the computation of the lift score using a classic confusion matrix. For instance, let's assume the following prediction and target labels, where \"1\" is the positive class:\n",
    "\n",
    "- $\\text{true labels}: [0, 0, 1, 0, 0, 1, 1, 1, 1, 1]$\n",
    "- $\\text{prediction}: [1, 0, 1, 0, 0, 0, 0, 1, 0, 0]$\n",
    "\n",
    "Then, our confusion matrix would look as follows:\n",
    "\n",
    "![](./lift_score_files/lift_cm_1.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Based on the confusion matrix above, with \"1\" as positive label, we compute *lift* as follows:\n",
    "\n",
    "$$\n",
    "\\text{lift} = \\frac{(TP/(TP+FP)}{(TP+FN)/(TP+TN+FP+FN)}\n",
    "$$\n",
    "\n",
    "Plugging in the actual values from the example above, we arrive at the following lift value:\n",
    "\n",
    "$$\n",
    "\\frac{2/(2+1)}{(2+4)/(2+3+1+4)} = 1.1111111111111112\n",
    "$$\n",
    "\n",
    "An alternative way to computing lift is by using the *support* metric [3]:\n",
    "\n",
    "$$\n",
    "\\text{lift} = \\frac{\\text{support}(\\text{true labels} \\cap \\text{prediction})}{\\text{support}(\\text{true labels}) \\times \\text{support}(\\text{prediction})},\n",
    "$$\n",
    "\n",
    "Support is $x / N$, where $x$ is the number of incidences of an observation and $N$ is the total number of samples in the datset. $\\text{true labels} \\cap \\text{prediction}$ are the true positives, $true labels$ are true positives plus false negatives, and $prediction$ are true positives plus false positives. Plugging the values from our example into the equation above, we arrive at:\n",
    "\n",
    "$$\n",
    "\\frac{2/10}{(6/10 \\times 3/10)} = 1.1111111111111112\n",
    "$$\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### References\n",
    "\n",
    "- [1]  S. Brin, R. Motwani, J. D. Ullman, and S. Tsur. [Dynamic itemset counting and implication rules for market basket data](http://dl.acm.org/citation.cfm?id=253325). In Proc. of the ACM SIGMOD Int'l Conf. on Management of Data (ACM SIGMOD '97), pages 265-276, 1997.\n",
    "- [2] https://www3.nd.edu/~busiforc/Lift_chart.html\n",
    "- [3] https://en.wikipedia.org/wiki/Association_rule_learning#Support"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Example 1 - Computing Lift"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This examples demonstrates the basic use of the `lift_score` function using the example from the *Overview* section."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1.1111111111111112"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import numpy as np\n",
    "from mlxtend.evaluate import lift_score\n",
    "\n",
    "y_target =    np.array([0, 0, 1, 0, 0, 1, 1, 1, 1, 1])\n",
    "y_predicted = np.array([1, 0, 1, 0, 0, 0, 0, 1, 0, 0])\n",
    "\n",
    "lift_score(y_target, y_predicted)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Example 2 - Using `lift_score` in `GridSearch`"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `lift_score` function can also be used with scikit-learn objects, such as `GridSearch`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "3.0\n",
      "{'gamma': 0.001, 'kernel': 'rbf', 'C': 1000}\n"
     ]
    }
   ],
   "source": [
    "from sklearn.datasets import load_iris\n",
    "from sklearn.model_selection import train_test_split\n",
    "from sklearn.model_selection import GridSearchCV\n",
    "from sklearn.svm import SVC\n",
    "from sklearn.metrics import make_scorer\n",
    "\n",
    "# make custom scorer\n",
    "lift_scorer = make_scorer(lift_score)\n",
    "\n",
    "\n",
    "iris = load_iris()\n",
    "X, y = iris.data, iris.target\n",
    "\n",
    "X_train, X_test, y_train, y_test = train_test_split(\n",
    "    X, y, test_size=0.2, stratify=y, random_state=123)\n",
    "\n",
    "hyperparameters = [{'kernel': ['rbf'], 'gamma': [1e-3, 1e-4],\n",
    "                     'C': [1, 10, 100, 1000]},\n",
    "                   {'kernel': ['linear'], 'C': [1, 10, 100, 1000]}]\n",
    "\n",
    "clf = GridSearchCV(SVC(), hyperparameters, cv=10,\n",
    "                   scoring=lift_scorer)\n",
    "clf.fit(X_train, y_train)\n",
    "\n",
    "print(clf.best_score_)\n",
    "print(clf.best_params_)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## API"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "## lift_score\n",
      "\n",
      "*lift_score(y_target, y_predicted, binary=True, positive_label=1)*\n",
      "\n",
      "Lift measures the degree to which the predictions of a\n",
      "classification model are better than randomly-generated predictions.\n",
      "\n",
      "The in terms of True Positives (TP), True Negatives (TN),\n",
      "False Positives (FP), and False Negatives (FN), the lift score is\n",
      "computed as:\n",
      "[ TP/(TP+FN) ] / [ (TP+FP) / (TP+TN+FP+FN) ]\n",
      "\n",
      "\n",
      "**Parameters**\n",
      "\n",
      "- `y_target` : array-like, shape=[n_samples]\n",
      "\n",
      "    True class labels.\n",
      "\n",
      "- `y_predicted` : array-like, shape=[n_samples]\n",
      "\n",
      "    Predicted class labels.\n",
      "\n",
      "- `binary` : bool (default: True)\n",
      "\n",
      "    Maps a multi-class problem onto a\n",
      "    binary, where\n",
      "    the positive class is 1 and\n",
      "    all other classes are 0.\n",
      "\n",
      "- `positive_label` : int (default: 0)\n",
      "\n",
      "    Class label of the positive class.\n",
      "\n",
      "**Returns**\n",
      "\n",
      "- `score` : float\n",
      "\n",
      "    Lift score in the range [0, $\\infty$]\n",
      "\n",
      "**Examples**\n",
      "\n",
      "For usage examples, please see\n",
      "    [http://rasbt.github.io/mlxtend/user_guide/evaluate/lift_score/](http://rasbt.github.io/mlxtend/user_guide/evaluate/lift_score/)\n",
      "\n",
      "\n"
     ]
    }
   ],
   "source": [
    "with open('../../api_modules/mlxtend.evaluate/lift_score.md', 'r') as f:\n",
    "    s = f.read() \n",
    "print(s)"
   ]
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.4"
  },
  "toc": {
   "nav_menu": {},
   "number_sections": true,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {},
   "toc_section_display": true,
   "toc_window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
